Geometric measures of redundance and 
irrelevance tradeoff exponent to choose 
suitable delay times for continuous systems 



Xiaodong Luo, Michael Small

Department of Electronic and Information Engineering, Hong Kong Polytechnic 
University, Hung Hom, Hong Kong.



Abstract 

Using the concept of the geometric measures of the redundance and irrelevance
tradeoff exponent (RITE), we present a new method to determine suitable delay
times for continuous systems. After applying the RITE algorithm to both
simulated and experimental observations, we find that the results are close to
those obtained from the criterion of the average mutual information (AMI),
while the RITE algorithm has the following advantages: simple implementation,
reasonable computational cost and robust performance against observational
noise.

1 INTRODUCTION



Since the embedding theorem of Takens [1] appeared, a number of papers
have been published on criteria for estimating a suitable delay time for a
nonlinear time series. One criterion, based on the second order autocorrelation
(SOAC), chooses as the delay the time at which the SOAC first becomes zero
or drops to a certain fraction of its initial value [2]. This method is simple
to implement, but there is no universal fraction that yields suitable delay
times for different systems (for example, the first zero criterion is
successful for the Rossler system but fails for the Lorenz system). As a
generalisation of the above idea, Albano et al. [3] proposed a heuristic
approach. They take the times at the consistent extrema of different
higher-order autocorrelation functions as candidates for a suitable embedding
window; therefore, once an embedding dimension is chosen, a delay time is
chosen as well. As a further step, having noticed that the SOAC is actually a
linear measure of dependence, Fraser & Swinney [5] introduced an important
statistic, mutual information, based on information theory. Mutual information
is a nonlinear measure of dependence between two data sets. For a scalar time
series, we can use the average mutual information (AMI) to select a proper
delay time; the criterion is to take the time at the first local minimum of
the AMI as the desired delay time. Mutual information is a valuable concept,
but it is rather complex to implement. In addition, its performance was found
to be not very robust for small data sets [7].
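To make the AMI criterion concrete, the following is a minimal sketch of a
histogram (plug-in) estimate of the AMI and of the first-local-minimum rule.
It is not the adaptive-partition algorithm of Ref. [5]; the fixed bin count,
the function names and the strict definition of a local minimum are all
assumptions made for illustration.

```python
import numpy as np


def average_mutual_information(x, tau, bins=16):
    """Histogram (plug-in) estimate, in bits, of the AMI between x_i and x_{i+tau}.

    Requires tau >= 1. The fixed bin count is an illustrative assumption.
    """
    x = np.asarray(x, dtype=float)
    a, b = x[:-tau], x[tau:]
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()              # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)    # marginal of x_i, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)    # marginal of x_{i+tau}, shape (1, bins)
    nz = pxy > 0                           # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px * py)[nz])))


def first_local_minimum(values):
    """Index of the first strict interior local minimum, or None if there is none."""
    for k in range(1, len(values) - 1):
        if values[k] < values[k - 1] and values[k] < values[k + 1]:
            return k
    return None


def ami_delay(x, max_tau=50, bins=16):
    """Delay time at the first local minimum of the AMI curve, or None."""
    taus = list(range(1, max_tau + 1))
    curve = [average_mutual_information(x, t, bins) for t in taus]
    k = first_local_minimum(curve)
    return None if k is None else taus[k]
```

The estimate depends on the number of bins and on the length of the series,
which is one way to see why the AMI can be delicate for small data sets.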



From other viewpoints, some criteria were proposed based on the utilization of 
the geometric information of the reconstructed attractor in embedding space. 
Buzug and Pfister [8] devised the fill-factor algorithm to determine a
suitable delay time by examining the attractor's expansion in embedding space.
The delay time at which the fill-factor is maximized is selected as the
suitable one.
But this algorithm also takes into account the situation of "overfolding", and 
more seriously, it will fail to yield significant delays if the attractor has more 
than one unstable focus [8]. As a solution, Buzug and Pfister designed another 
algorithm, integral local deformation (ILD). This algorithm chooses a
suitable delay time at the minimum of the attractor's integral local
deformation. By comparison, this algorithm needs substantially more
computational time than the fill-factor algorithm and, as we will indicate in
a later section, it might be more suitable to use this algorithm to choose the
embedding window rather than the delay time.



Many other criteria have been proposed. For example, Rosenstein et al. [4]
developed an approach named reconstruction signal strength, resting on the
concepts of redundance error and irrelevance error. Their approach is
computationally efficient and achieves a satisfactory performance, but the
criterion for choosing suitable delay times is somewhat empirical. In this
communication we do not intend to provide a detailed review; readers are
invited to refer to [9] and [10] and the references therein for more details.



In the remaining sections, we first propose a new algorithm to choose
suitable delay times based on the concept of the geometric measures of the
redundance and irrelevance tradeoff exponent (RITE). We then examine the
performance of this algorithm by applying it to data sets from both simulation
and experimental observations. Finally, we give a summary.

2 THE ALGORITHM OF RITE



In a very recent paper Cellucci and coworkers [11] state their viewpoint on
embedding methods as: A circular logic has resulted in which embedding
criteria are assessed by an adjudicating criterion which is itself an
embedding criterion. Following this viewpoint, we learn that the best
embedding criteria might differ under different adjudicating criteria. Hence
we would like to clarify that we do not seek the best embedding criteria for
different adjudicating criteria; instead we intend to let our adjudicating
criterion fit as many cases as possible.

As is well known, a sufficiently high embedding dimension is a necessary but
not sufficient condition for an embedding reconstruction according to the
embedding theorem of Takens. To be an embedding imposes two constraints on
the reconstruction mapping $\Psi : R \to R^m$, where m is the embedding
dimension. One is that $\Psi$ shall be a one-to-one mapping; the other is that
the derivative mapping $D \cdot \Psi$ shall also be one-to-one [6], where D
denotes the differentiating operator on $\Psi$.

In practice, although some delay times no longer lead to an embedding
reconstruction (unlike the ideal situation), as will be discussed below, it is
hoped that at least some others remain. We note that these remaining delay
times equivalently lead to an embedding in the sense of characterizing the
reconstructed attractor, although some particular values might indeed
facilitate the analysis of a time series. Hence our adjudicating criterion is
to guarantee that the reconstruction mapping is an embedding, and even if we
obtain different delay times from different algorithms, we still consider them
as suitable candidates for an embedding reconstruction.

For a delay time embedding reconstruction, a scalar time series
$\{x_i : i = 1, 2, \ldots, N\}$ is used to construct vectors
$X_i = (x_i, x_{i+\tau}, \ldots, x_{i+(m-1)\tau})$ in $R^m$, where m is the
embedding dimension and $\tau$ is the delay time. Now let us consider the
effects of different delay times on the reconstructed attractor. Without loss
of generality, we confine our discussion to the two-dimensional embedding
space $x_{i+\tau}$ vs. $x_i$. Fig. 1 shows the reconstructed attractors of the
Lorenz system [6] for three different delay times. When $\tau$ is too small,
$x_{i+\tau}$ will be very close to $x_i$ due to the continuity of the
manifold. Therefore the pair points $(x_i, x_{i+\tau})$ will be distributed
around the identity line, as indicated in Fig. 1 (a). In practice, however,
the presence of noise will cause an embedding vector
$X_i = (x_i, x_{i+\tau}, \ldots, x_{i+(m-1)\tau})$ to be distributed as a
"ball" rather than a point in the metric space $R^m$. The balls of adjacent
vectors might intersect each other, hence the reconstruction mapping
$\Psi : R \to R^m$ is not one-to-one and is no longer an embedding. When the
delay time $\tau$ is too large, say $\tau = 32$ as adopted in Fig. 1 (c), the
reconstructed attractor is overfolded and does not preserve the geometric
structure of the original attractor, in comparison with panels (a) and (b),
which also means the reconstruction is not an embedding.

Fig. 1. Effects of different delay times on the reconstructed attractor of the
Lorenz system in the two-dimensional embedding space: (a) delay time = 2;
(b) delay time = 8; (c) delay time = 32.

From the viewpoint of information theory, a delay time $\tau$ that is too
small means that $x_{i+\tau}$ contains mainly redundant information about
$x_i$. This is called redundance. If the delay time is too large, then for
chaotic systems $x_{i+\tau}$ will be irrelevant to $x_i$, hence $x_{i+\tau}$
contains no information about $x_i$. This is known as irrelevance. As Liebert
and Schuster [12] have argued, we shall consider not only the effect of
redundance but also that of irrelevance when estimating suitable delay times.
Therefore a tradeoff shall be achieved between redundance and irrelevance so
as to guarantee that the reconstruction mapping is an embedding. We define the
following statistic, namely the redundance and irrelevance tradeoff exponent
(RITE), to measure the tradeoff,



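$$\mathrm{RITE} = \frac{\rho(x_i, x_{i+\tau})\,\langle x_i^2\rangle + \left(1 - \rho(x_i, x_{i+\tau})\right)\langle x_i\rangle^2}{\langle x_i^2\rangle + \langle x_i\rangle^2} \qquad (1)$$

where $\langle\cdot\rangle$ denotes the expectation taken over time i and

$$\rho(x_i, x_{i+\tau}) = \frac{\mathrm{cov}(x_i, x_{i+\tau})}{\mathrm{var}(x_i)} = \frac{\langle x_i x_{i+\tau}\rangle - \langle x_i\rangle^2}{\langle x_i^2\rangle - \langle x_i\rangle^2} \qquad (2)$$

where $\rho(x_i, x_{i+\tau})$ is the SOAC, and $\mathrm{cov}(x_i, x_{i+\tau})$
and $\mathrm{var}(x_i)$ are the covariance with delay time $\tau$ and the
variance of the time series $\{x_i\}$ respectively. After simplifications, we
have

$$\mathrm{RITE} = \frac{\langle x_i x_{i+\tau}\rangle}{\langle x_i^2\rangle + \langle x_i\rangle^2} \qquad (3)$$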
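The simplification from Eqn. (1) to Eqn. (3) can be checked by substituting
Eqn. (2) into the numerator of Eqn. (1). The following worked step is a sketch
using only the quantities defined above, with $\rho$ abbreviating
$\rho(x_i, x_{i+\tau})$:

```latex
\begin{align*}
\rho\,\langle x_i^2\rangle + (1-\rho)\,\langle x_i\rangle^2
  &= \rho\left(\langle x_i^2\rangle - \langle x_i\rangle^2\right) + \langle x_i\rangle^2 \\
  &= \left(\langle x_i x_{i+\tau}\rangle - \langle x_i\rangle^2\right) + \langle x_i\rangle^2 \\
  &= \langle x_i x_{i+\tau}\rangle .
\end{align*}
```

Dividing by $\langle x_i^2\rangle + \langle x_i\rangle^2$ then gives
Eqn. (3).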



As we shall see below, Eqn. (3) is only a constant affine transformation of
the SOAC if directly applied to the original time series $\{x_i\}$. Before
that, let us first interpret the meaning of Eqn. (1). We take
$\langle x_i^2\rangle$ as the case of complete redundance for the measure
$\langle x_i x_{i+\tau}\rangle$, when the delay time $\tau$ tends to zero and
referring to $x_{i+\tau}$ brings no more information about $x_i$. Conversely,
$\langle x_i\rangle^2$ is the case of complete irrelevance for the measure
$\langle x_i x_{i+\tau}\rangle$, when $x_{i+\tau}$ is irrelevant and thus
uncorrelated to $x_i$, hence $\langle x_i x_{i+\tau}\rangle$ reduces to
$\langle x_i\rangle^2$. The SOAC $\rho(x_i, x_{i+\tau})$ measures the
redundance between $x_{i+\tau}$ and $x_i$ with a weight of
$\langle x_i^2\rangle / (\langle x_i^2\rangle + \langle x_i\rangle^2)$, while
$1 - \rho(x_i, x_{i+\tau})$ denotes the measure of irrelevance with the
assigned weight of
$\langle x_i\rangle^2 / (\langle x_i^2\rangle + \langle x_i\rangle^2)$.
Starting from $\tau = 0$, as the delay time $\tau$ increases, the redundance
measure $\rho(x_i, x_{i+\tau})$ shall usually decrease while the irrelevance
measure $1 - \rho(x_i, x_{i+\tau})$ shall increase, hence a natural criterion
is to choose the suitable delay time at the first local minimum of RITE, which
guarantees the reconstruction to be an embedding in an optimal way according
to Eqn. (1).

Fig. 2. Geometric variables in the two-dimensional embedding space.

If we directly apply the measure of RITE to the original scalar time series
$\{x_i\}$, we find from Eqn. (1) that it is a trivial measure with the same
performance as that of the SOAC, since $\langle x_i^2\rangle$ and
$\langle x_i\rangle^2$ are both independent of the delay time $\tau$. A remedy
is to characterize the reconstructed attractor equivalently in the
two-dimensional embedding space $x_{i+\tau}$ vs. $x_i$ instead of in the time
domain.
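The affine relation between RITE and the SOAC on the raw series can be checked
numerically. The sketch below estimates the expectations by sample means;
taking the mean and the mean square over the whole series, so that they do not
depend on the delay, and the function names themselves are assumptions made
for illustration.

```python
import numpy as np


def soac(x, tau):
    """Second order autocorrelation of Eqn. (2), expectations as sample means."""
    x = np.asarray(x, dtype=float)
    n = len(x) - tau
    mean, mean_sq = x.mean(), np.mean(x ** 2)
    lagged_product = np.mean(x[:n] * x[tau:tau + n])
    return (lagged_product - mean ** 2) / (mean_sq - mean ** 2)


def rite(x, tau):
    """RITE in the simplified form of Eqn. (3)."""
    x = np.asarray(x, dtype=float)
    n = len(x) - tau
    lagged_product = np.mean(x[:n] * x[tau:tau + n])
    return lagged_product / (np.mean(x ** 2) + x.mean() ** 2)


def affine_coefficients(x):
    """Slope and offset such that rite(x, tau) = slope * soac(x, tau) + offset."""
    x = np.asarray(x, dtype=float)
    mean_sq, sq_mean = np.mean(x ** 2), x.mean() ** 2
    total = mean_sq + sq_mean
    return (mean_sq - sq_mean) / total, sq_mean / total
```

Because the slope and the offset do not depend on the delay, the two curves
have their extrema at the same delays, which is why RITE applied directly to
the raw series adds nothing to the SOAC.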



Let $\overrightarrow{(x_i, x_{i+\tau})}$ denote the vector from the origin to
the point $(x_i, x_{i+\tau})$ in the two-dimensional embedding space. As shown
in Fig. 2, the distance $d_i$ of the pair point $(x_i, x_{i+\tau})$ to the
identity line $x_{i+\tau} = x_i$ is expressed by:

$$d_i = \frac{1}{\sqrt{2}}\,\left|x_{i+\tau} - x_i\right| \qquad (4)$$

where $|\cdot|$ denotes the distance in Euclidean space. The projection length
$p_i$ of the vector $\overrightarrow{(x_i, x_{i+\tau})}$ onto the identity
line is:

$$p_i = \frac{1}{\sqrt{2}}\,\left|x_{i+\tau} + x_i\right| \qquad (5)$$

Therefore the angle between the vector $\overrightarrow{(x_i, x_{i+\tau})}$
and the identity line is:

$$\theta_i = \tan^{-1}\left|\frac{x_{i+\tau} - x_i}{x_{i+\tau} + x_i}\right| \qquad (6)$$

Fig. 3. The left panel shows the average integral local deformation vs. delay
time for the time series from the Lorenz system with 9000 data points. The
number of reference points is 500 and the radius for neighbour searching is
set to 5. The embedding dimension m varies from 2 to 6 (from upper to lower)
and the delay time increases from 1 to 50. The right panel uses the same
parameters as the left except that the time series is shorter, consisting of
only 1200 data points.

From Eqns. (4), (5) and (6), we obtain three new time series $\{d_i\}$,
$\{p_i\}$ and $\{\theta_i\}$ derived from the original one, which consist of
geometric description variables of the reconstructed attractor in the
two-dimensional embedding space. These geometric variables shall also be
continuous in the time domain, since all three of the above transforms are
continuous. We apply the measure of RITE to these geometric variables with the
same criterion for choosing suitable delay times as stated above, i.e., a
suitable delay time is chosen at the first local minimum of the geometric
measures of RITE.
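The procedure of this section can be sketched as follows. Several points are
assumptions rather than statements of the text: the same delay is used both to
build the geometric series and as the lag in Eqn. (3), the series is first
standardized by subtracting its mean and dividing by its standard deviation
(our reading of Eqn. (7) in Section 3), a local minimum is taken to be strict,
and all function names are illustrative.

```python
import numpy as np


def geometric_series(y, tau):
    """Distance, projection and angle series of Eqns. (4)-(6) for tau >= 1.

    The common factor 1/sqrt(2) is dropped; RITE and the angle do not depend on it.
    """
    y = np.asarray(y, dtype=float)
    a, b = y[:-tau], y[tau:]
    d = np.abs(b - a)
    p = np.abs(b + a)
    theta = np.arctan2(d, p)   # arctan(d / p), also defined when p == 0
    return d, p, theta


def rite(v, tau):
    """RITE of Eqn. (3) applied to a series v at lag tau."""
    v = np.asarray(v, dtype=float)
    n = len(v) - tau
    return np.mean(v[:n] * v[tau:]) / (np.mean(v ** 2) + np.mean(v) ** 2)


def first_local_minimum(values):
    """Index of the first strict interior local minimum, or None."""
    for k in range(1, len(values) - 1):
        if values[k] < values[k - 1] and values[k] < values[k + 1]:
            return k
    return None


def rite_delay(x, max_tau=50, measure="distance"):
    """Delay at the first local minimum of a geometric measure of RITE."""
    x = np.asarray(x, dtype=float)
    y = (x - x.mean()) / x.std()   # assumed form of the standardization
    column = {"distance": 0, "projection": 1, "angle": 2}[measure]
    taus = list(range(1, max_tau + 1))
    curve = [rite(geometric_series(y, t)[column], t) for t in taus]
    k = first_local_minimum(curve)
    return (None if k is None else taus[k]), curve
```

Each delay costs a fixed number of passes over the data, which is the source
of the low computational cost of the method.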
3 NUMERICAL RESULTS

We note that if the origin of the embedding space of a time series is marginal
to, or even outside of, the reconstructed attractor, the sensitivity of the
geometric measures of RITE to different delay times will be significantly
reduced. We therefore apply the following smooth affine transform to the
original time series $\{x_i\}$:

$$y_i = \frac{x_i - \langle x_i\rangle}{\sqrt{\mathrm{var}(x_i)}} \qquad (7)$$

The new time series has the same dynamical properties in the time domain as
the original one, while it takes the origin of the embedding space as the
"center" of its reconstructed attractor in a statistical sense. With this
consideration, we prefer to study the time series $\{y_i\}$ rather than
$\{x_i\}$. In addition, we discard the scale factor $1/\sqrt{2}$ in both
Eqn. (4) and Eqn. (5) in all of our calculations, which does not affect the
results.

We will study simulated data sets from the Lorenz and Rossler systems [6].
For the Lorenz system, the equations are:

$$\begin{aligned}
\dot{x}(t) &= a\,(y(t) - x(t)) \\
\dot{y}(t) &= r\,x(t) - y(t) - x(t)\,z(t) \\
\dot{z}(t) &= x(t)\,y(t) - b\,z(t)
\end{aligned} \qquad (8)$$

with parameters a = 10, r = 28, b = 8/3 and the sampling time
$\Delta t_s = 0.02$ s. For the Rossler system, the equations are:

$$\begin{aligned}
\dot{x}(t) &= -y(t) - z(t) \\
\dot{y}(t) &= x(t) + a\,y(t) \\
\dot{z}(t) &= b + z(t)\,(x(t) - c)
\end{aligned} \qquad (9)$$

with parameters a = 0.15, b = 0.20, c = 10.00 and the sampling time
$\Delta t_s = 0.1$ s.
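Time series of this kind can be generated as in the following sketch. The
text does not state the integration scheme, the initial conditions or the
length of the discarded transient, so the fourth-order Runge-Kutta integrator,
the number of substeps per sampling interval and the default values below are
all assumptions.

```python
import numpy as np


def lorenz(state, a=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz equations, Eqn. (8)."""
    x, y, z = state
    return np.array([a * (y - x), r * x - y - x * z, x * y - b * z])


def rossler(state, a=0.15, b=0.20, c=10.0):
    """Right-hand side of the Rossler equations, Eqn. (9)."""
    x, y, z = state
    return np.array([-y - z, x + a * y, b + z * (x - c)])


def rk4_sample(f, state0, dt_sample, n_samples, substeps=10, n_discard=1000):
    """Integrate ds/dt = f(s) with RK4 and record the state every dt_sample."""
    h = dt_sample / substeps
    state = np.asarray(state0, dtype=float)
    out = np.empty((n_samples, len(state)))
    for i in range(n_discard + n_samples):
        for _ in range(substeps):
            k1 = f(state)
            k2 = f(state + 0.5 * h * k1)
            k3 = f(state + 0.5 * h * k2)
            k4 = f(state + h * k3)
            state = state + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        if i >= n_discard:           # skip the initial transient
            out[i - n_discard] = state
    return out
```

For example, `rk4_sample(lorenz, [1.0, 1.0, 1.0], 0.02, 9000)[:, 0]` would
give a scalar series of the x component sampled at 0.02 s, and
`rk4_sample(rossler, [1.0, 1.0, 0.0], 0.1, 9000)[:, 0]` the analogous Rossler
series; which component the authors observed is not stated in the text.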



We will also apply the geometric measures of RITE to the sunspot record
from year 1700 to year 1987 and to infant respiratory data during stage 4
sleep (S4) [13]. In addition, we will calculate the delay times chosen by the
ILD and AMI algorithms for comparison. Our results are listed in Table 1.

Table 1. Delay times chosen by the algorithms of ILD, AMI and the geometric
measures of RITE.

Data set   ILD ($\tau/m$)   AMI   Geometric measures of RITE
                                  distance   projection   angle
Lorenz     9/4              8     9          12           10
Rossler    10/3             16    16         14           13
Sunspot    2                3     2          3            2
S4         5/5              8     7          8            5

Although the ILD algorithm was originally designed to determine suitable
delay times $\tau$, it might be more appropriate to use it to establish the
embedding window $m \cdot \tau$. As indicated in the left panel of Fig. 3,
when using Eqn. (24) in Ref. [8] for calculations, for the Lorenz system the
products of the embedding dimensions m (m > correlation dimension $d_c$) and
their corresponding delay times $\tau$ at the first local minimum of the
average ILD are nearly a constant of 36. This conclusion also holds for the
data sets of the Rossler system and S4. In contrast, the sunspot record has
consistent local minima and the products $m \cdot \tau$ do not remain
constant. This still does not contradict our conclusion, as the sunspot record
is an extremely short time series. As shown in the right panel of Fig. 3, the
constant embedding window vanishes when the time series from the Lorenz system
is shorter; instead, a consistent local minimum appears at $\tau = 8$.

Since the embedding window $m \cdot \tau$ remains constant, different delay
times will be obtained from the ILD algorithm for different embedding
dimensions. Nevertheless, we think the ILD algorithm can still indicate how to
obtain a proper embedding reconstruction with a sufficiently high embedding
dimension. First we choose an optimal embedding dimension for each data set
(except for the sunspot record) under the criterion of global False Nearest
Neighbours (GFNN) [15] [11]; then we obtain the corresponding delay time from
the embedding window (see footnote 3). For the sunspot record we choose the
delay time at the first consistent local minimum of the average ILD. The
results are indicated in Table 1.

From Table 1 we find that, loosely speaking, the results of different algorithms 
are close to each other. As we have stated in the previous section, although 
delay times chosen by different geometric measures of RITE and the other 
two algorithms are usually different, we still take all of them as the suitable 
candidates for an embedding reconstruction. 

3 It must be admitted that the procedure is somewhat "circular" in this
situation, since the choice of the optimal embedding dimension by the GFNN
algorithm in turn needs a suitable delay time as a parameter. In our
calculation, we use the delay time obtained by the AMI algorithm in Table 1 as
the parameter to determine the optimal embedding dimension for each data set
(except for the sunspot record).
[Figure 4: panels titled "Lorenz system" and "Rossler system"; horizontal
axes: delay time.]

Fig. 4. Nonlinear prediction error of the local constant model vs. delay time.
The embedding dimensions used in the model are 4, 3, 3 and 5 for the Lorenz
system, the Rossler system, the sunspot record and the S4 data set
respectively. The delay time ranges from 1 to 60 in all cases. The NLPEs
corresponding to the delay times in Table 1 chosen by different algorithms for
each data set are marked with diamonds. We use the program zeroth in the
TISEAN package [14] for our calculations.

We use the nonlinear prediction error (NLPE) to verify the reconstruction
quality of our choices. As is well known, the local constant model [16] uses
nearest neighbours for nonlinear prediction; when a sufficiently high
embedding dimension is reached, most of the effect of false nearest neighbours
is excluded. When the embedding dimension and the radius of neighbour
searching are fixed, the NLPE depends only on the delay time. Hence the NLPE
can qualitatively determine whether our choice for an embedding reconstruction
is acceptable, as the prediction error for a suitable delay time shall achieve
a tradeoff between being too small and being too large if the time series is
neither completely predictable nor completely unpredictable. In Fig. 4, the
NLPEs corresponding to the delay times listed in Table 1 chosen by different
algorithms for each data set are marked with diamonds. As can be seen, a
certain tradeoff is indeed achieved for each geometric measure of RITE.
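The idea of a local constant predictor can be sketched in a few lines. This
is not the TISEAN zeroth program: the maximum-norm neighbourhood, the fallback
to the single nearest neighbour when no point lies within the radius, the
train/test split and the normalisation by the standard deviation are all
assumptions made for illustration.

```python
import numpy as np


def delay_embed(x, m, tau):
    """Rows are the delay vectors (x_i, x_{i+tau}, ..., x_{i+(m-1)tau})."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (m - 1) * tau
    return np.column_stack([x[k * tau:k * tau + n] for k in range(m)])


def local_constant_nlpe(x, m, tau, radius, n_test=200):
    """RMS one-step prediction error of a local constant model, divided by std(x)."""
    x = np.asarray(x, dtype=float)
    vectors = delay_embed(x, m, tau)[:-1]   # the last vector has no successor
    targets = x[(m - 1) * tau + 1:]         # sample following each vector
    train_v, train_t = vectors[:-n_test], targets[:-n_test]
    test_v, test_t = vectors[-n_test:], targets[-n_test:]
    errors = []
    for v, t in zip(test_v, test_t):
        dist = np.max(np.abs(train_v - v), axis=1)   # maximum norm
        near = dist <= radius
        if not near.any():
            near = dist == dist.min()                # fall back to nearest neighbour
        # Local constant prediction: average successor of the neighbours.
        errors.append(train_t[near].mean() - t)
    return float(np.sqrt(np.mean(np.square(errors))) / x.std())
```

With m and the radius held fixed, evaluating this function over a range of
delays gives a curve of the kind described for Fig. 4.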

Now let us examine the computational cost of each algorithm listed in Table 1.
Let N denote the size of the time series $\{x_i\}$. The ILD algorithm requires
approximately $O(N_{ref} \times (N \ln N))$ unit operations for searching
nearest neighbours for each embedding dimension and each delay time, where
$N_{ref}$ is the number of reference points. The AMI algorithm needs about
$O(N^2)$ unit operations to calculate the joint probability distribution for
each delay time, while the RITE algorithm is faster than both of them,
requiring about $O(N)$ unit operations for both the transforms of the original
data set and the calculation of the expectations for each delay time.

Table 2. Delay times chosen by the geometric measures of RITE for the time
series from the Lorenz and Rossler systems contaminated with observational
Gaussian white noise.

Noise level (%)   Lorenz system                    Rossler system
                  distance   projection   angle    distance   projection   angle
0                 9          12           10       16         14           13
3                 9          12           10       16         14           13
6                 9          12           9        16         14           13
9                 9          12           10       16         14           13
12                9          12           8        16         14           2



We will also test the robustness of the geometric measures of RITE against
observational Gaussian white noise $N(0, \sigma^2)$. The noise level is
defined as the ratio of $\sigma$ to $\sigma_s$, where $\sigma_s$ is the
standard deviation of the original scalar time series $\{x_i\}$ before the
transform of Eqn. (7). As indicated in Table 2, using the delay times chosen
at noise level zero as references, we find that both the distance and the
projection measures of RITE are rather robust against observational noise:
noise levels up to 12% do not affect the choices of delay time. As expected,
the angle measure of RITE is more sensitive to noise. For the Lorenz system,
small fluctuations of the choice appear when the noise level is higher than
6%. For the Rossler system, the performance seems better. The odd choice
$\tau = 2$ at noise level 12% follows from the criterion suggested above and
is due to a small spike on the curve of the angle measure of RITE vs. delay
time, while the next local minimum is exactly at delay time $\tau = 13$.
Although the robustness of the geometric measures of RITE against
observational noise might vary from system to system, we believe that in
general it is satisfactory.
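The noise contamination used in this test can be sketched as below, with the
noise level expressed in percent of the standard deviation of the clean
series. The function name and the optional random generator argument are
assumptions made for illustration.

```python
import numpy as np


def add_observational_noise(x, level_percent, rng=None):
    """Add Gaussian white noise whose standard deviation is
    level_percent / 100 times the standard deviation of the clean series x."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    sigma = (level_percent / 100.0) * x.std()
    return x + rng.normal(0.0, sigma, size=x.shape)
```

The noise is added to the raw series, so any standardization of the data
would be applied afterwards to the contaminated series.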



4 CONCLUSION 



It has been a difficult problem to set up a universal criterion for the choice
of delay time. Average mutual information is the most preferred statistic for
choosing a delay time since it has a valuable physical meaning, but it
requires a complicated implementation. To achieve higher accuracy, a more
complex implementation and more running time are needed. It also does not deal
very well with short time series. By comparison, the RITE algorithm aims to
provide an optimal choice of delay time with the objective of guaranteeing
that the reconstruction is an embedding. Our calculations indicate that the
RITE algorithm performs well on a variety of time series of various lengths,
even in the presence of substantial noise. We therefore feel that such a
simple algorithm should be preferred to the more complex implementations
suggested previously.



5 ACKNOWLEDGMENT



This research was supported by a Hong Kong Polytechnic University Research 
Grant (No. A-PE46). 



References 

[1] F. Takens, in D. A. Rand and L. S. Young, editors, Dynamical Systems and
Turbulence, Lecture Notes in Mathematics Vol. 898 (Springer-Verlag, New York,
1980).

[2] A. M. Albano, J. Muench, C. Schwartz, A. I. Mees, and P. E. Rapp, Singular
value decomposition and the Grassberger-Procaccia algorithm, Phys. Rev. A
38, 3017 (1988).

[3] A. M. Albano, A. Passamante, and M. E. Farrell, Using higher-order correlations 
to define an embedding window, Physica D 54, 85 (1991). 

[4] M. T. Rosenstein, J. J. Collins, and C. J. De Luca, Reconstruction expansion 
as a geometry-based framework for choosing proper delay times, Physica D 73, 
82 (1994). 

[5] A. M. Fraser, and H. L. Swinney, Independent coordinates for strange attractors 
from mutual information, Phys. Rev. A 33, 1134 (1986). 

[6] A. Galka, Topics in Nonlinear Time Series Analysis with Implications for EEG 
Analysis, Advanced Series in Nonlinear Dynamics, Vol. 14 (World Scientific, 
2000). 

[7] J. M. Martinerie, A. M. Albano, A. I. Mees, and P. E. Rapp, Mutual
information, strange attractors, and the optimal estimation of dimension, Phys.
Rev. A 45, 7058 (1992).

[8] T. Buzug, and G. Pfister, Optimal delay time and embedding dimension for 
delay-time coordinates by analysis of the global static and local dynamical 
behavior of strange attractors, Phys. Rev. A 45, 7073 (1992). 

[9] W. Liebert, K. Pawelzik, and H. G. Schuster, Optimal embeddings of chaotic
attractors from topological considerations, Europhys. Lett. 14, 521 (1991).






[10] G. Kember, and A. C. Fowler, A correlation function for choosing time delays 
in phase portrait reconstructions, Phys. Lett. A 179, 72 (1993). 

[11] C. J. Cellucci, A. M. Albano, and P. E. Rapp, Comparative study of embedding 
methods, Phys. Rev. E 67, 066210 (2003). 

[12] W. Liebert, and H. Schuster, Proper choice of the time delay for the analysis 
of chaotic time series, Phys. Lett. A 143, 107 (1989). 

[13] M. Small, K. Judd, M. Lowe, and S. Stick, Is breathing in infants chaotic?
Dimension estimates for respiratory patterns during quiet sleep, Journal of
Applied Physiology 86, 359 (1999).

[14] R. Hegger, H. Kantz, and T. Schreiber, Practical implementation of nonlinear 
time series methods: The TISEAN package, CHAOS 9, 413 (1999). 

[15] M. B. Kennel, R. Brown, and H. D. I. Abarbanel, Determining embedding 
dimension for phase-space reconstruction using a geometrical construction, 
Phys. Rev. A 45, 3403 (1992). 

[16] J. D. Farmer and J. J. Sidorowich, Predicting chaotic time series, Phys.
Rev. Lett. 59, 845 (1987).






